Fast DNA sequencing via transverse electronic transport 
A rapid and low-cost method to sequence DNA would usher in a revolution in medicine. We 
propose and theoretically show the feasibility of a protocol for sequencing based on the distributions 
of transverse electrical currents of single-stranded DNA while it translocates through a nanopore. 
Our estimates, based on the statistics of these distributions, reveal that sequencing of an entire 
human genome could be done with very high accuracy in a matter of hours without parallelization, 
i.e., orders of magnitude faster than present techniques. The practical implementation of our 
approach would represent a substantial advancement in our ability to study, predict and cure diseases 
from the perspective of the genetic makeup of each individual. 
Recent innovations in manufacturing processes have made it possible to
fabricate devices with pores at the nanometer scale, i.e., the scale of
individual nucleotides. This opens up fascinating new avenues for
sequencing DNA. For instance, one suggested method is to measure the
so-called blockade current. In this method, a longitudinal electric field
is applied to pull DNA through a pore. As the DNA goes through, a
significant fraction of ions is blocked from simultaneously entering the
pore. By continuously measuring the ionic current, single molecules of DNA
can be detected. Other methods using different detection schemes, ranging
from optical [13] to capacitive, have also been suggested. Despite much
effort, however, single-nucleotide resolution has not yet been achieved [22].

In this Letter, we explore an alternative idea which would allow
single-base resolution by measuring the electrical current perpendicular
to the DNA backbone while a single strand immersed in a solution
translocates through a pore. To do this, we envision embedding electrodes
in the walls of a nanopore, as schematically shown in the inset of
Figure 1. The realization of such a configuration, while difficult to
achieve in practice, is within reach of present experimental capabilities.
The DNA is sequenced by using the measured current as an electronic
signature of the bases as they pass through the pore. We couple molecular
dynamics simulations and quantum mechanical current calculations to
examine the feasibility of this approach. We find that if some control is
exerted over the DNA dynamics, the distributions of current values for
each nucleotide will be sufficiently different to allow for rapid
sequencing. We show that a transverse field of the same magnitude as that
driving the current provides sufficient control.

FIG. 1: Transverse current versus time (in arbitrary units) of a highly
idealized single strand of DNA translocating through a nanopore with a
constant motion. The sequence of the single strand is AGCATCGCTC. The left
inset shows a top-view schematic of the pore cross section with four
electrodes (represented by gold rectangles). The right inset shows an
atomistic side view of the idealized single strand of DNA and one set of
gold electrodes across which electrical current is calculated. The boxes
show half the time each nucleotide spends in the junction. Within each
box, a unique signal from each of the bases can be seen.

We first discuss an idealized case of DNA dynamics which sets the
foundations for the approach we describe. Second, we look at the
distributions of transverse currents through the nucleotides in a
realistic setting using a combination of quantum-mechanical calculations
of current and molecular dynamics simulations of DNA translocation through
the pore. We use a Green's function method to calculate the current across
the electrodes embedded in the nanopore, as described in Ref. [23]. A
tight-binding model is used to represent the molecule and electrode gold
atoms. For each carbon, nitrogen, oxygen, and phosphorus atom, $s$-,
$p_x$-, $p_y$-, and $p_z$-orbitals are used, while $s$-orbitals are used
for hydrogen and gold. The retarded Green's function,
$\mathcal{G}_{DNA}$, of the system can then be written as



$$\mathcal{G}_{DNA}(E) = \left[E\,\mathcal{S}_{DNA} - \mathcal{H}_{DNA} - \Sigma_t - \Sigma_b\right]^{-1}, \qquad (1)$$

where $\mathcal{S}_{DNA}$ and $\mathcal{H}_{DNA}$ are the overlap and the
Hamiltonian matrices [24]. $\Sigma_{t(b)}$ are the self-energy terms
describing the coupling between the top (bottom) electrode and the DNA.
The total current can then be expressed as

$$I = \frac{2e}{h}\int_{-\infty}^{\infty} dE\, T(E)\left[f_t(E) - f_b(E)\right], \qquad (2)$$

where $T(E)$ is the transmission coefficient and is given by

$$T(E) = \mathrm{Tr}\left[\Gamma_t\,\mathcal{G}_{DNA}\,\Gamma_b\,\mathcal{G}_{DNA}^{\dagger}\right]. \qquad (3)$$

$f_{t(b)}$ is the Fermi-Dirac function of the top (bottom) electrode, and
$\Gamma_{t(b)} = i(\Sigma_{t(b)} - \Sigma_{t(b)}^{\dagger})$. The
electrodes are composed of 3x3 gold atoms arranged as a (111) surface two
layers deep, and are biased at 1 V. The electrode spacing is 12.5 Å. Room
temperature has been used for all calculations throughout the paper.
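
As a rough illustration of how the Green's function, transmission, and current expressions above fit together numerically, the following Python sketch evaluates them for a toy two-orbital "molecule" between two electrodes. It is not the model used in this Letter: the toy Hamiltonian, the orthogonal basis (identity overlap matrix), the energy-independent (wide-band) self-energies with coupling `gamma`, the symmetric placement of the chemical potentials at ±bias/2, and all function names are assumptions made only to keep the example small.

```python
import numpy as np

E_CHARGE = 1.602176634e-19   # elementary charge (C)
H_PLANCK = 6.62607015e-34    # Planck constant (J s)
K_B_EV = 8.617333262e-5      # Boltzmann constant (eV/K)


def fermi(energy, mu, temperature):
    """Fermi-Dirac occupation; energies in eV, temperature in K."""
    x = (energy - mu) / (K_B_EV * temperature)
    return 0.5 * (1.0 - np.tanh(0.5 * x))  # equals 1 / (exp(x) + 1)


def transmission(energy, h, s, sigma_t, sigma_b):
    """T(E) = Tr[Gamma_t G Gamma_b G^dagger], G = [E S - H - Sigma_t - Sigma_b]^-1."""
    g = np.linalg.inv(energy * s - h - sigma_t - sigma_b)
    gamma_t = 1j * (sigma_t - sigma_t.conj().T)
    gamma_b = 1j * (sigma_b - sigma_b.conj().T)
    return float(np.trace(gamma_t @ g @ gamma_b @ g.conj().T).real)


def current(bias, h, s, sigma_t, sigma_b, temperature=300.0, n_grid=2001):
    """Current in amperes from I = (2e/h) * integral of T(E) [f_t - f_b] dE."""
    # Integrate over the bias window plus 0.5 eV (about 19 kT at 300 K) per side.
    energies = np.linspace(-0.5 * bias - 0.5, 0.5 * bias + 0.5, n_grid)
    t_vals = np.array([transmission(e, h, s, sigma_t, sigma_b) for e in energies])
    window = (fermi(energies, 0.5 * bias, temperature)
              - fermi(energies, -0.5 * bias, temperature))
    integrand = t_vals * window
    # Trapezoidal rule; the integral is in eV and is converted to joules below.
    integral_ev = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(energies))
    return 2.0 * E_CHARGE / H_PLANCK * integral_ev * E_CHARGE


# Toy system (assumed values, in eV): two orbitals coupled by a hopping of -0.3.
h_toy = np.array([[0.0, -0.3], [-0.3, 0.0]], dtype=complex)
s_toy = np.eye(2, dtype=complex)            # orthogonal basis assumed
gamma = 0.1                                 # assumed electrode coupling
sigma_t_toy = np.zeros((2, 2), dtype=complex)
sigma_b_toy = np.zeros((2, 2), dtype=complex)
sigma_t_toy[0, 0] = -0.5j * gamma           # top electrode couples to orbital 0
sigma_b_toy[1, 1] = -0.5j * gamma           # bottom electrode couples to orbital 1

t_at_zero = transmission(0.0, h_toy, s_toy, sigma_t_toy, sigma_b_toy)
i_toy = current(1.0, h_toy, s_toy, sigma_t_toy, sigma_b_toy)
```

In a calculation of the kind described in the text, the toy matrices would be replaced by the tight-binding Hamiltonian and overlap matrices of the nucleotide plus electrodes, and the self-energies would be computed from the gold electrodes rather than assumed constant.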

The first question is whether it is at all possible, in the best case
scenario, to see differences in the transverse current between the
different nucleotides in the absence of structural fluctuations, ions, and
water. We address this by studying a highly idealized case of DNA
translocation dynamics. The transverse current of a random sequence of
single-stranded DNA (ss-DNA) moving through the junction with a constant
motion is shown in Figure 1. This figure shows that the different
nucleotides do indeed have unique electronic signals in this ideal case.
Similar results have been obtained for static configurations of
nucleotides in a previous theoretical work by two of the present authors
[23], where, in addition, it was shown that neighboring bases do not
affect the electronic signature of a given base so long as the electrode
widths are of nanometer scale, i.e., of the order of the base spacing.
These results provide a good indication that DNA can be sequenced if its
dynamics through the pore can be controlled. As we show below, such
control is provided by a transverse field of the same magnitude as that
driving the current.

Obviously, in a real device there will always be fluctuations of the
current. These fluctuations are mainly due to two sources: 1) structural
fluctuations of the DNA, ions and water, and 2) noise associated with the
electrical current itself, such as thermal, shot and 1/f noise [25]. Apart
from 1/f noise, which can be overcome by operating slightly away from the
zero-frequency limit, we estimate that, for the case at hand, shot noise
and thermal noise are negligibly small, giving rise to less than 0.1% of
error in the current. The most significant source of noise is thus due to
the structural motion of the DNA and its environment [27].

We have explored this structural noise by coupling molecular dynamics
simulations with electronic transport calculations (described above) to
obtain the real-time transverse current of the ss-DNA translocating
through a Si$_3$N$_4$ nanopore. The Si$_3$N$_4$ making up the membrane is
assumed to be in the β-phase [33] with a funnel-like shape (see Figure 2),
while the electrodes are as described above. A larger electrode spacing
only reduces the current, while a shorter spacing does not allow easy
translocation of the DNA. As we describe below, the actual geometry of the
electrodes and pore does not change the protocol we suggest for
sequencing. The positions of the atoms of the nanopore and electrodes are
assumed to be frozen throughout the simulation. The electric field
generated by the electrodes is not included when the ss-DNA translocates
through the pore, since the driving field is much larger in magnitude. Its
effect will be analyzed later. A large driving field of 10 kcal/(mol Å e)
is used to achieve feasible simulation times. In experiments such a large
field would not be necessary.

For convenience we choose to study the current that flows across two pairs
of mutually opposite electrodes (see inset of Figure 1). The four
electrodes are not necessary for the conclusions we draw (in an experiment
two are enough). However, analyzing the current in two perpendicular
directions gives us additional information on the orientation of a
nucleotide inside the pore. For instance, if the ratio between the two
currents is large, we know the nucleotide is aligned in the direction of
the electrodes with the larger current. If the two currents are about
equal in both directions, it is likely that the base is aligned at a 45
degree angle, and so forth. This is illustrated by the snapshots in
Figure 2, where we see the expected behavior of the current for an ss-DNA
with fifteen consecutive cytosine bases translocating between the two
pairs of electrodes [35].

We have found similar curves for all other bases as well, making it
difficult to sequence DNA on the basis of just a simple read-out of the
current, such as these curves show. In other words, due to structural
fluctuations and the irregular dynamics of the ss-DNA, a single
measurement of the current for each base is not enough to distinguish the
different bases with high precision (see also Supporting Material). We
thus conclude that a distribution of electrical current values for each
base needs to be obtained. This can be done by slowing the DNA
translocation in the pore [3] so that each base spends a larger amount of
time aligned with the electrodes. Most importantly, we find that when the
field that drives the DNA through the pore is smaller than the transverse
field that generates the current, one base at a time can align with a pair
of electrodes quite easily. This is because the DNA backbone is charged in
solution, so that its position can be controlled by the transverse field
(see also Supporting Material).

FIG. 2: Currents as a function of time for a poly(dC)$_{15}$ translocating
through a nanopore. The blue (red) curve indicates the current, for a bias
of 1 V, between the right and left (front and back) electrodes represented
in gray in the snapshots (the fourth electrode is located behind the field
of view and is hence not visible in the snapshots). During approximately
the first half of the translocation, the two currents follow each other,
indicating that no bases are aligned with either electrode pair. The left
snapshot shows the case in which a nucleotide is aligned with a pair of
electrodes; the right snapshot shows the case in which the nucleotide is
not aligned between either pair of electrodes. In the snapshots, solution
atoms are not shown and red colors are a guide for the eye only.

Figure 3 shows the main results of this Letter. It shows the calculated
distribution of transverse currents for each base in a realistic setting
when the driving field is much smaller than the transverse field. We
obtain these distributions by turning off the driving field and sampling
the current while one base fluctuates between the electrodes [37]. The
distributions for each base are indeed different. Note that these
distributions may vary according to the microscopic geometry of the pore
and electrodes, but our suggested protocol to sequence via transverse
transport remains the same. First, one needs to "calibrate" a given
nanopore device by obtaining the distributions of current with, say, short
homogeneous polynucleotides, one for each base. Second, once these
"target" distributions are obtained, a given sequence can be extracted
with the same device by comparing the various currents with these "target"
distributions, and thus assigning a base to each measurement within a
certain statistical accuracy. Both the target and sequencing distributions
need to be obtained under the conditions we have discussed above, i.e.,
with the driving field smaller than the transverse field, which allows the
transverse field to control the alignment of the nucleotides with respect
to the electrodes.
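
The two-step protocol above (calibrate, then compare) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "currents" are synthetic numbers drawn from made-up Gaussians in log10(current), not the distributions of Figure 3; equal prior probabilities are assumed for the four bases; and the helper names `calibrate` and `assign_base` are hypothetical.

```python
import numpy as np


def calibrate(samples_by_base, bin_edges, floor=1e-6):
    """Build normalized "target" histograms, one per base, from calibration currents."""
    targets = {}
    for base, samples in samples_by_base.items():
        counts, _ = np.histogram(samples, bins=bin_edges)
        probs = counts.astype(float) + floor       # floor avoids log(0) in empty bins
        targets[base] = probs / probs.sum()
    return targets


def assign_base(currents, targets, bin_edges):
    """Return the base whose target distribution best explains the measured currents."""
    # Map each current to its histogram bin; out-of-range values go to the edge bins.
    idx = np.clip(np.digitize(currents, bin_edges) - 1, 0, len(bin_edges) - 2)
    log_like = {base: float(np.sum(np.log(p[idx]))) for base, p in targets.items()}
    return max(log_like, key=log_like.get), log_like


# Step 1, calibration: synthetic log10(current) samples for each homogeneous strand.
rng = np.random.default_rng(0)
assumed_means = {"A": 0.0, "T": -1.5, "C": -3.0, "G": 1.5}   # made up, not from the Letter
bin_edges = np.linspace(-7.0, 6.0, 53)
calibration = {b: rng.normal(m, 1.0, size=5000) for b, m in assumed_means.items()}
targets = calibrate(calibration, bin_edges)

# Step 2, sequencing: 80 current samples taken while an unknown base (here G) is aligned.
unknown = rng.normal(assumed_means["G"], 1.0, size=80)
called_base, scores = assign_base(unknown, targets, bin_edges)
```

Summing log-probabilities over the measurements is equivalent to multiplying the per-measurement probabilities, so the base with the largest score is the most likely one given all samples collected while that base sits between the electrodes.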
FIG. 3: Probability distributions of currents at a bias of 1 V for
poly(dX)$_{15}$, where X is Adenine/Thymine/Cytosine/Guanine for the
black/blue/red/green curve, respectively. The thin lines show the actual
current intervals used for the count, while the thick lines are an
interpolation. The inset shows the exponentially decaying ratio of falsely
identified bases versus number of independent counts (measurements) of the
current averaged over the four bases.



Finally, given these distributions and the accuracy with which we want to
sequence DNA, we can answer the question of how many independent
electrical current measurements one needs to do in order to sequence DNA
within that accuracy. The number of current measurements will dictate how
fast we can sequence. We can easily estimate this speed from the
distributions of Figure 3 by calculating the statistical likelihood for
all configurations of a given base in the junction region and multiplying
it by the probability that we can tell this base from all other bases for
the value of the current at that specific configuration. The average
probability that we can correctly sequence a base after N measurements is
then given by

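$$\langle P \rangle = \sum_{X=A,T,C,G} \frac{1}{4} \sum_{\{I_n\}} \left[\prod_{n=1}^{N} X(I_n)\right] \frac{\prod_{n=1}^{N} P_X^n}{\prod_{n=1}^{N} P_A^n + \prod_{n=1}^{N} P_T^n + \prod_{n=1}^{N} P_C^n + \prod_{n=1}^{N} P_G^n}, \qquad (4)$$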

where A, T, C and G are the distributions, as shown in Figure 3, for the
four bases. $P_X^n$ is the probability that a base is X considering only
the current for measurement n. It can be found by comparing the ratios of
the four distributions. Finally, the sum over $\{I_n\}$ is a sum over all
possible sets of measurements of size N. The inset of Figure 3 shows
$1 - \langle P \rangle$, the exponentially decaying ratio of falsely
identified bases versus number of independent counts (measurements) of the
current averaged over the four bases, where the ensemble average is
performed using Monte Carlo methods. From this inset we see that if, for
instance, we want to sequence DNA with an error of 0.1%, we need about 80
electrical current measurements to distinguish the four bases. If we are
able to collect, say, $10^7$ measurements of the current per second (a
typical rate of electrical current measurements) we can sequence the whole
genome in less than seven hours without parallelization. Note that it is
mainly the rate at which electrical current measurements can be done that
sets an upper limit for the sequencing speed, not the DNA translocation
speed. Clearly, these estimates may vary with different device structures
but are representative of the speeds attainable with this sequencing
method.
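
The Monte Carlo average and the speed estimate in this paragraph can be illustrated with a short Python sketch. The sketch assumes toy Gaussian current distributions with equally spaced means (`toy_means`, in units of the common width) in place of the distributions of Figure 3, assumes equal prior probabilities for the four bases, and takes the human genome to contain roughly 3 x 10^9 bases; the function names are hypothetical. It therefore reproduces the qualitative decay of the error with the number of measurements, not the specific curve in the inset.

```python
import numpy as np


def misidentification_rate(means, sigma, n_meas, trials=2000, seed=0):
    """Monte Carlo estimate of 1 - <P> for n_meas independent current samples."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    p_correct = []
    for x, mu_x in enumerate(means):
        # Draw `trials` sets of n_meas samples from the distribution of the true base x.
        samples = rng.normal(mu_x, sigma, size=(trials, n_meas))
        # Log-likelihood of each set under every candidate base (shape: trials x bases).
        log_like = -0.5 * (((samples[:, :, None] - means) / sigma) ** 2).sum(axis=1)
        log_like -= log_like.max(axis=1, keepdims=True)       # numerical stability
        posterior = np.exp(log_like)
        posterior /= posterior.sum(axis=1, keepdims=True)     # equal priors assumed
        p_correct.append(posterior[:, x].mean())
    return 1.0 - float(np.mean(p_correct))


def sequencing_time_hours(n_bases, meas_per_base, meas_per_second):
    """Time to read n_bases when the measurement rate is the bottleneck."""
    return n_bases * meas_per_base / meas_per_second / 3600.0


toy_means = [0.0, 1.0, 2.0, 3.0]      # assumed, in units of sigma; not from Figure 3
errors = {n: misidentification_rate(toy_means, 1.0, n) for n in (1, 10, 40)}

# Numbers quoted in the text: about 80 measurements per base at 10^7 measurements
# per second, applied to a genome of roughly 3 x 10^9 bases (assumed size).
hours = sequencing_time_hours(3e9, 80, 1e7)
```

With the quoted numbers the estimate is 3 x 10^9 x 80 / 10^7 = 2.4 x 10^4 s, i.e., a little under seven hours, in line with the figure stated above.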

We thus conclude that the approach we have described in this Letter shows
tremendous potential as an alternative sequencing method. If successfully
implemented, DNA sequencing could be performed orders of magnitude faster
than currently available methods and still much faster than other
pre-production approaches recently suggested.